(57) Abrege/Abstract:

This invention aims to reduce the total time required for document conversion by
outputting appropriate document data which matches the document type definition
after conversion, so that the validity verification step in the document structure
conversion can be omitted. Specifically, this invention provides a document
conversion method for converting a first structured document F1, formed based on a
first document type definition D1, to a second structured document F3, formed based
on a second document type definition D2. The document conversion method comprises
analyzing the document type definition D1 and the document type definition D2 and
extracting the differing document type definitions; generating, based on the results
of the analysis, a conversion template T2 having described therein a conversion rule
which prevents the structured document F3, which is the result of the document
conversion process, from being contradictory to the document type definition D2; and
performing the document conversion process using the conversion template T2.
TITLE OF THE INVENTION
DOCUMENT CONVERSION SYSTEM, DOCUMENT CONVERSION METHOD AND
COMPUTER READABLE RECORDING MEDIUM STORING DOCUMENT CONVERSION
PROGRAM

CROSS REFERENCE TO RELATED APPLICATIONS 

This application is based upon and claims the benefit of
priority from the prior Japanese Patent Application No.
P2001-346736, filed on November 12, 2001; the entire contents
of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to a document conversion
system for converting a first structured document formed by a
first document schema to a second structured document formed
by a second document schema, a document conversion method, and
a computer readable recording medium storing a document
conversion program.

2. Description of the Related Art

Conventionally, structured documents have been proposed which
not only handle the text data of text document files as mere
character strings but are also capable of expressing the logical
structure of the document, its layout, attributes, etc. For
example, SGML, specified by International Organization for
Standardization (ISO) standard 8879, and XML, specified by the
World Wide Web Consortium (W3C), are currently available.
According to SGML and XML, the logical structure of a document
is specified by a document type definition (DTD), and the roles
of document component elements such as title, author's name,
preface and text can be expressed using identifiers for structure
elements called document tags.

In the structured document, a specific meaning, role, etc.
may need to be assigned to an identifier, and additional
information (attributes) can be added to the identifier to
express this characteristic.
Further, formats of the stylesheet for describing the
style of a document, which is required for displaying the
structured document on a screen and printing the structured
document on paper, have been proposed. As stylesheet formats,
for example, the Document Style Semantics and Specification
Language (DSSSL) of ISO standard 10179 and the Extensible
Stylesheet Language (XSL) specified by W3C are available.

DSSSL and XSL describe the document style by specifying
a pattern, which expresses a condition on the identifiers
constituting SGML or XML, and an action applied to each
identifier which satisfies that pattern.

The stylesheet provides the document style and converts
the structure of the document. The specification for extracting
a particular pattern of the structured document in XSL is called
XSL Transformations (XSLT). The use of XSLT enables an XML
document to be converted according to predetermined conditions
and outputted in a different format such as HTML, for example.
The structured document is produced by dividing document
data (text) into structurally meaningful units and marking
these units using elements and attributes. In XML, the
method for defining the structure of the document data is called
a schema and, generally, a document type definition (DTD) is used
for defining the schema. The schema defines which elements
should appear as the content of the document, in what order and
how many times, and which attributes should be possessed.
Since the structured document itself has no definition of its
data, it cannot automatically check for an error even if data is
missing for some reason. Thus, a document type definition needs
to be provided in order to display or exchange data, and the
document needs to be described according to that definition.

Fig. 1 shows an example flow of a conventional document
conversion process for the structured document F1, which is
described in XML. As shown in the figure, the conversion process
of the structured document generally comprises two main steps:
conversion of document structure S101 and validity verification
process S102.
The conversion of document structure S101 is a step of
generating a new document by extracting elements and attributes
using a pattern matching technique and replacing them with new
elements and attributes, or by adding new elements, attributes
and text. This process is performed based on a conversion rule
described in a conversion template T1. The conversion template
T1 contains a structure conversion rule and is generated in
advance as an XSL file. Existing software (e.g., Xalan-C++)
can be utilized as the XSLT conversion engine for the conversion
of document structure S101.

The validity verification process S102 is a step of
verifying whether the output of the XSLT conversion process
(structured document F2) follows the document type definition D2
after conversion, and is performed using the document type
definition D2. The validity verification process S102 can be
performed by existing software (e.g., XML4C). If the result of
the validity verification process S102 is acceptable, a new
structured document F3 is generated. If it is not acceptable,
document structure correction process S104 is performed on the
structured document F2 based on the error content, and the
validity verification process S102 is performed again on the
corrected structured document F2.

Fig. 2A is a diagram showing a conventional example of
converting the structured document F1, defined by the document
type definition D1, to the structured document F3 based on the
conversion template T1. The figure shows the structured document
F2 after a first conversion (i), which is contradictory to the
document type definition D2, and the structured document F3, in
which the contradictions are corrected. In the document example
of Fig. 2A, the UL element and ul element define an unnumbered
list (unordered list), and each list item is defined with the LI
element or li element, which is a sub-element of the UL or ul
element.

As the elements after the conversion, the ul element and
li element correspond to the UL element and LI element. In the
structured document F1, a list comprising three items is
described. In the structured document F2 containing
contradictions, the corresponding elements are simply replaced.
If a rule that only one li element can be defined
under the ul element is specified in the document type definition
D2, each li element must be the sub-element of its own ul element
(each li element is enclosed by ul tags) in the structured
document F2. Consequently, the document is corrected to an
appropriate structured document F3 which satisfies the document
type definition D2.
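
The conventional flow described above (convert, verify, correct, verify
again) can be sketched in Python as follows. This is only an illustration
under stated assumptions: the figures are not reproduced in this text, so
the sample document, the function names and the single D2 rule "at most
one li per ul" are hypothetical stand-ins, and the standard-library
ElementTree module is used in place of an XSLT engine and a validating
parser.

```python
import xml.etree.ElementTree as ET

# Assumed element correspondence between D1 and D2 names.
RENAME = {"UL": "ul", "LI": "li"}

def naive_convert(elem):
    """S101 with a T1-style rule: rename elements, keep the structure."""
    out = ET.Element(RENAME.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(naive_convert(child))
    return out

def is_valid(root):
    """S102 stand-in: the assumed D2 rule allows at most one li per ul."""
    return all(len(ul.findall("li")) <= 1 for ul in root.iter("ul"))

def correct(root):
    """S104 stand-in: give every li its own enclosing ul."""
    fragments = []
    for li in root.findall("li"):
        ul = ET.Element("ul")
        ul.append(li)
        fragments.append(ul)
    return fragments

f1 = ET.fromstring("<UL><LI>a</LI><LI>b</LI><LI>c</LI></UL>")
f2 = naive_convert(f1)                  # first pass over the document
f2_ok = is_valid(f2)                    # second pass: verification fails
f3 = correct(f2)                        # correction step
f3_ok = all(is_valid(ul) for ul in f3)  # verification has to run again
f3_text = "".join(ET.tostring(ul, encoding="unicode") for ul in f3)
print(f2_ok, f3_ok)
print(f3_text)
```

Each stage walks the whole tree, which is the repeated traversal cost the
specification attributes to the conventional method.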

Fig. 2B is an example of a description of a conventional
conversion template T1. As shown in the figure, the conversion
template T1 acts as a conversion rule for the conversion (i) from
the structured document F1 to the structured document F2
containing contradictions.

The conversion template T1 is comprised of a pattern
assigning part and a template assigning part.

Through the conversion process, a document pattern (tag)
defined by the pattern assigning part is extracted from the
structured document. Further, addition, deletion and
replacement are performed on the extracted document pattern
according to the template assigning part in order to generate
a new document.

In the conventional conversion template T1, each of
<xsl:template match>, <xsl:apply-templates> and <xsl:value-of> is
an element defined by the XSL specification.

(1) and (3), which use <xsl:template match>, specify
patterns: (1) means extraction of the UL element, while
(3) means extraction of the LI element. (2) and (4) specify
templates. The UL element is extracted according to the pattern
specifying of (1), and then the template of (2) is specified.

The template specifying of (2) means describing the start
tag of ul and, after the process of applying a template rule to
the LI element is performed, describing the end tag of ul. The
template rules for the LI element are (3) and (4), and the LI
element is extracted according to the pattern specifying of (3).
Further, by the template specifying of (4), the start tag of
li is described, the portion under the LI element is converted
to text, and finally the end tag of li is described. Since
there are three LI elements in the structured document F1, three
portions corresponding to the pattern specifying of (3) are
extracted. The template specifying of (4) is applied to each of
them, and then the process is complete.
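
The pattern/template pairing of rules (1) to (4) can be imitated with a
small rule table in Python. This is a sketch for illustration only, not
the content of Fig. 2B: the names T1, ul_rule, li_rule, apply_templates
and transform are hypothetical, and the dispatch below handles only
element names that have a rule.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def apply_templates(elem, rules):
    """Apply the matching rule to each child element, in document order."""
    return "".join(rules[child.tag](child, rules) for child in elem)

def ul_rule(elem, rules):
    # (1) pattern: UL. (2) template: <ul>, process the children, </ul>.
    return "<ul>" + apply_templates(elem, rules) + "</ul>"

def li_rule(elem, rules):
    # (3) pattern: LI. (4) template: <li>, text content, </li>.
    return "<li>" + escape("".join(elem.itertext())) + "</li>"

# The rule table plays the role of the conversion template T1.
T1 = {"UL": ul_rule, "LI": li_rule}

def transform(xml_text, rules):
    root = ET.fromstring(xml_text)
    return rules[root.tag](root, rules)

f2 = transform("<UL><LI>a</LI><LI>b</LI><LI>c</LI></UL>", T1)
print(f2)
```

The LI rule runs three times, once per extracted LI element, and the
result keeps all three li elements under a single ul.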

However, as described above, in a case where the document
type definition D1 contains a contradiction with the document
type definition D2 (e.g., a specification which is prohibited in
the document type definition D2), a contradiction with the
document type definition D2 remains if the conversion merely
extracts elements/attributes according to the conversion template
T1 and replaces them with corresponding elements/attributes or
adds such elements/attributes.

According to the conventional structured document
conversion method, both the document structure conversion
process S101 and the validity verification process S102 search
elements/attributes from the root element to the end element in
the document data. Therefore, there is a problem that the
conversion of the document takes longer as the number of times
the document correction process S104 is required increases.

Further, there is a problem that if the result of the validity
verification process S102 is not acceptable, an operator must
manually perform the document correction process S104 in an
off-line state based on the result of the validity verification
process S102.

BRIEF SUMMARY OF THE INVENTION 

It is therefore an object of the present invention to reduce
the total time required for document conversion by outputting
appropriate document data which matches the document type
definition after conversion, so as to omit the validity
verification step in the document structure conversion.

The present invention has a feature of, upon converting
a first structured document formed based on a first document
schema into a second structured document formed based on a second
document schema, analyzing the first document schema and the
second document schema and extracting the differing document type
definitions; generating, based on the result of the analysis, a
conversion template having described therein a conversion rule
which prevents the second structured document, which is the
result of a document conversion process, from being contradictory
to the second document schema; and performing the document
conversion process using the conversion template.

According to the present invention, if there is output
logic which does not satisfy the document type definition after
conversion (second document schema), a process for correcting
the contradiction is reflected in the conversion template, so
that the second structured document, which is the result of the
document structure conversion process, conforms to the document
type definition after conversion. As a result, the validity
verification step after conversion, which is performed
conventionally, can be omitted, thereby reducing the total time
required for the document conversion.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Fig. 1 is a schematic diagram showing the outline of a 
conventional document conversion method; 

Fig. 2A and 2B are diagrams showing an example of generation
of a conventional conversion template;

Fig. 3 is a schematic diagram showing the outline of a
document conversion method of an embodiment of the present
invention;

Fig. 4A and 4B are diagrams showing an example of
description of the conversion template according to the
embodiment of the present invention;

Fig. 5A and 5B are diagrams showing an example of generation
of another conversion template of the embodiment of the present
invention;

Fig. 6A and 6B are diagrams showing an example of generation
of another conversion template of the embodiment of the present
invention;

Fig. 7 is a schematic diagram showing the outline of the
document conversion method according to a modification of the
embodiment of the present invention;

Fig. 8 is a block diagram showing the configuration of
a computer in which a document conversion program of the
embodiment is installed;

Fig. 9 is a flowchart showing the process of the computer in
which the document conversion program of the embodiment is
installed;

Fig. 10 is a perspective view showing a computer readable
recording medium in which the document conversion program of
the embodiment is stored;

Fig. 11 is a schematic diagram showing the process of the
computer in which the document conversion program of the
embodiment is installed;

Fig. 12 is a schematic diagram showing the process of
document conversion via a communication network using a computer
in which the document conversion program of the embodiment is
installed; and

Fig. 13 is a table showing the identifier correspondence
table and conversion rule relating to the embodiment of the
present invention.



DETAILED DESCRIPTION OF THE INVENTION

Document Conversion Method

Hereinafter, the embodiments of a document conversion
method of the present invention will be described. Fig. 3 is
a schematic diagram showing the outline of the document
conversion method of this embodiment.

As shown in the figure, a conversion template T2 contains
a description of an appropriate conversion rule, obtained by
interpreting a document type definition D1 (first document
schema) which is applied before the conversion and a document
type definition D2 (second document schema) which is applied
after the conversion, for outputting a result according to the
document type definition D2. In a document structure conversion
process S101, the document structure of a structured document
F1 (first structured document), which is the document before
conversion, is converted according to the description of the
conversion template T2 in order to generate a new structured
document F3 (second structured document).

Such a conversion template T2 can be generated by the
following procedure. Note that, according to this
embodiment, the document type definition D1 and the document
type definition D2 are document data having identifiers (markup
tags) for defining the logical structure of a character string
of the document, as in XML and HTML.

Here, an identifier correspondence table and a conversion
rule are generated. Fig. 13 is a table which shows the identifier
correspondence table and conversion rule relating to this
embodiment.

As shown in Fig. 13, the identifier correspondence table
is a table which indicates the relationship between elements
that define the same logical structure, like the UL element and
the ul element. The conversion rule is comprised of a replacement
template for defining the logical structure after conversion
and the conditions for applying the template.
The identifier correspondence table is generated based
on the relationship between elements expressed in capital letters
and small letters, elements using arguments having the same
content, or elements having the same function. Following this
identifier correspondence table, the logical structures before
and after conversion are compared, and portions that differ
between them are detected. For example, as shown in Fig. 2,
the document type definition of the logical structure formed
of the UL element and LI element in the structured document F1
and the document type definition of the logical structure formed
of the ul element and li element in the structured document F3
are compared so as to detect differing portions.

Further, the conditions of these detected differing
portions are analyzed. According to the example shown in the
figure, if there are plural LI elements (two or more), the UL
element is nested with respect to each LI element. Therefore, in
this example, (LI>=2) is adopted as the condition. Then, a
conversion rule is generated based on the conditions of the
differing portions and the corresponding logical structure after
conversion, and the conversion rule is reflected in the
conversion template T2.
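
The two analysis steps above (building the identifier correspondence
table, then detecting differing portions and their conditions) might be
sketched as follows. The data model is an assumption made for this
example: Fig. 13 is not reproduced in this text, so the dictionaries D1
and D2 reduce each document type definition to "parent -> (child,
maximum count)", only the capital/small letter criterion is implemented,
and all function and key names are hypothetical.

```python
def build_correspondence(src_names, dst_names):
    """Pair element names that differ only in letter case (UL <-> ul)."""
    by_lower = {name.lower(): name for name in dst_names}
    return {s: by_lower[s.lower()] for s in src_names if s.lower() in by_lower}

# Hypothetical, simplified content models: parent -> (child, max count).
# None means "no upper limit".
D1 = {"UL": ("LI", None)}
D2 = {"ul": ("li", 1)}

def conversion_rules(d1, d2, table):
    """Detect differing portions and turn each into a condition + template."""
    rules = []
    for parent, (child, src_max) in d1.items():
        dst_parent = table.get(parent)
        if dst_parent not in d2:
            continue
        dst_child, dst_max = d2[dst_parent]
        # D1 permits more children than D2 does: a contradiction is possible.
        if dst_max is not None and (src_max is None or src_max > dst_max):
            rules.append({
                "match": parent,
                "condition": f"{child}>={dst_max + 1}",
                "template": f"one {dst_parent} per {dst_child}",
            })
    return rules

table = build_correspondence(["UL", "LI"], ["ul", "li"])
rules = conversion_rules(D1, D2, table)
print(table)
print(rules)
```

For the UL/LI example this yields the condition LI>=2, which is the
condition named in the paragraph above.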

According to this embodiment, the conversion template T2
is comprised of pattern specifying and template specifying. The
pattern specifies an identifier to be converted; here, this is
an identifier described in the identifier correspondence table.
The template specifying reflects the conversion rule in Fig. 13
and is comprised of a replacement template, which defines the
logical structure after the conversion, and the condition for
applying the replacement template.

Figs. 4A and 4B show the template rules T12 and T22 as
examples of description of the conversion template T2 of this
embodiment. The example corrects the contradiction shown in
Fig. 2, and the structured document F3 is outputted by a single
conversion (Fig. 2 (iii)). According to the template rule T12
of this embodiment, (5) and (7) indicate the pattern specifying:
(5) describes the extraction of the UL element, while (7)
describes the extraction of the LI element. Further, (6) and
(8) describe template specifying.

In the example shown in Fig. 4A, firstly, the UL element
is extracted according to the pattern specifying of (5), and the
template of (6) is specified. The template specifying of (6)
means shifting the object to which a template is to be applied
from the current element (UL) to a sub-element (LI). The template
rule for the LI element is indicated by (7) and (8).

Next, the LI element is extracted by the pattern specifying
of (7). Then, by the template specifying of (8), the start tag
for ul is described, the start tag for li is described, and the
portion under the LI element is converted to text and described.
Finally, the end tags of li and ul are described.
Since the structured document F1 before conversion has
three LI elements as shown in Fig. 3, three portions corresponding
to the pattern specifying of (7) are extracted and the process
of the template specifying of (8) is performed on each so as to
complete the process of conversion.

According to the template rule T22 shown in Fig. 4B,
<xsl:for-each> is one of the elements defined by the XSL
specification. (9) means the pattern specifying, which specifies
the extraction of the UL element. (10) means the template
specifying, which specifies repeated processing of plural LI
elements. As for the content of the process, the start tag for
ul is described, the start tag for li is described, the portion
under the LI element is converted to text and described, and
then the end tags for li and ul are described. Since the
structured document F1 contains three LI elements, the process by
the <xsl:for-each> element in the template specifying of (10) is
repeated for the three elements, and then the process is
complete.
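
The effect of the template rules T12 and T22 can be imitated in Python
with a loop over the LI elements, in the spirit of <xsl:for-each>. This
is a sketch under assumptions, not the stylesheet of Fig. 4: the function
name convert_ul and the sample input are hypothetical, and ElementTree
stands in for the XSLT engine.

```python
import xml.etree.ElementTree as ET

def convert_ul(ul_elem):
    """Emit one <ul><li>text</li></ul> fragment per LI child."""
    fragments = []
    for li_src in ul_elem.findall("LI"):      # repeated per LI element
        ul = ET.Element("ul")                 # start tag for ul
        li = ET.SubElement(ul, "li")          # start tag for li
        li.text = "".join(li_src.itertext())  # content converted to text
        fragments.append(ul)                  # end tags of li and ul
    return fragments

f1 = ET.fromstring("<UL><LI>a</LI><LI>b</LI><LI>c</LI></UL>")
f3 = "".join(ET.tostring(frag, encoding="unicode") for frag in convert_ul(f1))
print(f3)
```

Because the constraint "only one li under each ul" is built into the
rule itself, the output needs no separate verification and correction
pass for that constraint.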

Next, an example of another conversion template will be
described. Figs. 5A and 5B are diagrams showing an example of
conversion of the body element and blockquote element. Fig.
5A shows the structured document F31 (first structured document),
which is the document before the conversion; the structured
document F32, which is a document after the conversion containing
contradictions; and the structured document F33 (second
structured document), in which the contradiction is corrected.
Fig. 5B shows a conventional conversion template T31 and the
conversion template T32 of this embodiment.
In the example document, the body element and BODY element
indicate the main body of a document, and the blockquote element
and BLOCKQUOTE element specify displaying a block of character
strings as a quotation. Although a div element specifies a block
to which the stylesheet is applied, the stylesheet does not always
have to be applied.

According to this embodiment, as shown in Fig. 13,
the div element is used as an element capable of containing the
body element and blockquote element. According to this
embodiment, before and after conversion, the body element and
blockquote element correspond to the BODY element and BLOCKQUOTE
element, respectively.

The structured document F31 indicates a character string
below the BODY element as the document main body and, further,
indicates a character string block below the BLOCKQUOTE element
as a quotation. The structured document F32 containing the
contradictions simply replaces corresponding elements.

In the document type definition D2, if a rule that a
character string cannot be described directly below the body
element and blockquote element is specified, the structured
document F32 is contradictory to the document type definition
D2. The structured document F33 corrects that contradiction
in the structured document F32 to satisfy the document type
definition D2 by placing a div element in each of the body
element and the blockquote element.

Fig. 5B is an example of description of the conversion
template rule. The conventional conversion template rule T31
describes the conversion rule for the conversion (iv) from the
structured document F31 to the document F32 after conversion,
as shown in Fig. 5A. The corrected conversion template
rule T32 describes the conversion rule for the conversion (vi)
from the structured document F31 to the structured document F33,
in which the contradiction is corrected.

According to the conventional conversion template rule
T31, (11) and (13) mean the pattern specifying: (11) specifies
extraction of the BODY element, while (13) specifies extraction
of the BLOCKQUOTE element. (12) and (14) mean the template
specifying.

Firstly, the BODY element is extracted according to the
pattern specifying of (11), and then the template of (12) is
specified. Secondly, in the template of (12), the start tag
for body is described and the object to which the template is to
be applied is shifted from the current element (BODY) to the
sub-element (BLOCKQUOTE). The template specifying of (12)
means that the end tag for body is described after the process
of the template rule for the sub-element (BLOCKQUOTE) is
performed.

The template rule for the BLOCKQUOTE element is indicated
by (13) and (14). The BLOCKQUOTE element is extracted according
to the pattern specifying of (13), and the template of (14) is
specified. In the template of (14), the start tag for blockquote
is described and the object to which the template is to be
applied is shifted from the current element (BLOCKQUOTE) to the
sub-element. Further, the template of (14) specifies describing
the end tag for blockquote after the process of the template rule
for the sub-element.

According to the conventional conversion template rule
T31, the BODY element and the BLOCKQUOTE element are simply
converted to the body element and blockquote element.

According to the conversion template rule T32 of this
embodiment, (15) and (17) mean the pattern specifying: (15)
specifies extraction of the BODY element, while (17) specifies
extraction of the BLOCKQUOTE element. (16) and (18) mean the
template specifying. Firstly, the BODY element is extracted
according to the pattern specifying of (15), and then the
template of (16) is specified. Secondly, in the template of
(16), the start tag for body is described, the start tag for div
is described, and the object to which the template is to be
applied is shifted from the current element (BODY) to the
sub-element (BLOCKQUOTE). The template specifying of (16)
means describing the end tags for div and body, as shown in
Fig. 5B, after the process of the template rule for the
sub-element (BLOCKQUOTE) is performed.

The template rule for the BLOCKQUOTE element is indicated
by (17) and (18). The BLOCKQUOTE element is extracted according
to the pattern specifying of (17), and the template of (18) is
specified. In the template of (18), the start tag for blockquote
is described, the start tag for div is described, and the object
to which the template is to be applied is shifted from the
current element (BLOCKQUOTE) to the sub-element. Further, the
template specifying of (18) means describing the end tags for div
and blockquote, as shown in Fig. 5B, after the process of the
template rule for the sub-element is performed. By using the
conversion template T32, the BODY element and the BLOCKQUOTE
element are converted to the body element and blockquote element,
respectively, and the div element can be placed in the body
element and the blockquote element.
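
A Python sketch of what the conversion template T32 achieves is given
below. It is an illustration under assumptions rather than the content of
Fig. 5B: the names WRAPPED and convert and the sample document are
hypothetical, and character data is copied through unchanged in the way
XSLT's built-in text rule would do.

```python
import xml.etree.ElementTree as ET

# Elements that get renamed and whose content is wrapped in a div.
WRAPPED = {"BODY": "body", "BLOCKQUOTE": "blockquote"}

def convert(elem):
    """Rename BODY/BLOCKQUOTE and place their content inside a div."""
    new_tag = WRAPPED.get(elem.tag)
    if new_tag is None:
        out = ET.Element(elem.tag, dict(elem.attrib))
        holder = out                       # other elements are copied as-is
    else:
        out = ET.Element(new_tag)          # start tag for body/blockquote
        holder = ET.SubElement(out, "div")  # start tag for div
    holder.text = elem.text                # text goes below div, not body
    for child in elem:                     # shift to the sub-elements
        holder.append(convert(child))
    out.tail = elem.tail
    return out

f31 = ET.fromstring("<BODY>main text<BLOCKQUOTE>quoted text</BLOCKQUOTE></BODY>")
f33 = ET.tostring(convert(f31), encoding="unicode")
print(f33)
```

No character string ends up directly below body or blockquote, so the
assumed D2 rule is satisfied after a single conversion.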

Further, an example of generating the conversion template
rule T2 according to this embodiment will be described. Figs.
6A and 6B are schematic diagrams of conversion examples with
regard to the ol element and li element. Fig. 6A shows the
structured document F41, which is the document before conversion
(first structured document); the structured document F42, which
is a document after conversion containing a contradiction; and
the structured document F43 after conversion (second structured
document), in which the contradiction is corrected. Fig. 6B shows
the conventional conversion template T41 and the conversion
template T42 of this embodiment.

The ol element and OL element generate a numbered list
(ordered list), and each list item is defined by the
li element or LI element, which is at the lower level of the ol
or OL element. The document F41 contains both a portion
in which the LI element exists and a portion in which the LI
element does not exist below the OL element.

As shown in Fig. 6A, the structured document F42 containing
contradictions simply replaces corresponding elements. In the
document type definition after conversion, if a rule that at
least one li element is required below the ol element is
specified, the structured document F42 is contradictory to the
document type definition after conversion.

The structured document F43 corrects the contradictions in
the structured document F42 to satisfy the document type
definition by replacing the ol element which has no li element
with the div element.

Fig. 6B shows an example of the conversion template rule
T42. The conventional conversion template rule T41 shown in
Fig. 6B describes the conversion rule for the conversion (vii)
from the structured document F41 to the structured document F42
after conversion, as shown in Fig. 6A. The conversion template
rule T42 shown in Fig. 6B describes the conversion rule for the
conversion (ix) from the structured document F41 to the
structured document F43.

As shown in Fig. 6, the conventional conversion template
rule T41 is also comprised of the patterns for specifying
extraction of the OL element and the LI element and the template
corresponding to each pattern. According to this conventional
conversion template rule T41, the OL element and LI element are
simply converted to the ol element and li element.

According to the conversion template rule T42 of this
embodiment, (19) and (21) mean the pattern specifying:
(19) specifies extraction of the OL element,
while (21) specifies extraction of the LI element.
(20) and (22) indicate the templates. Firstly, the
OL element is extracted according to the pattern specifying of
(19), and then the template of (20) is specified.

Each of <xsl:choose>, <xsl:when> and <xsl:otherwise> in Fig.
6B is an element defined by the XSL specification. The process
is performed based on a combination of these three elements.
If the result of the conditional expression ("count(LI) != '0'")
described in the test attribute is true, the process in the
<xsl:when> element is performed, and if the result is false, the
process in the <xsl:otherwise> element is performed.

Under the conditional expression ("count(LI) != '0'"),
the quantity of the LI elements is counted, and if one or more
LI elements exist, the result is true. In this case, the start
tag for ol is described according to the template of the
<xsl:when> element, and then the process of the template rule for
the LI element is performed. After that, the end tag of ol is
described.

Further, according to the conditional expression
("count(LI) != '0'"), if the quantity of the LI elements is 0,
the result is false. In this case, the start tag of div is
described according to the template of the <xsl:otherwise>
element, and then the object to which the template is to be
applied is shifted from the current element (OL) to the
sub-element. After the process of the template rule for the
sub-element is performed, the end tag of div is described.
According to the conversion template rule T42, if no li element
exists below the ol element, the ol element can be replaced with
the div element.
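
The branch taken by the conversion template rule T42 can be sketched in
Python as follows. This only illustrates the choose/when/otherwise logic
described above: the function name convert_ol and the sample inputs are
hypothetical, and ElementTree replaces the XSLT engine.

```python
import copy
import xml.etree.ElementTree as ET

def convert_ol(ol):
    """Emit ol when LI children exist; otherwise emit div instead."""
    items = ol.findall("LI")
    if len(items) != 0:                    # xsl:when: count(LI) != 0
        out = ET.Element("ol")
        for src in items:                  # template rule for each LI
            li = ET.SubElement(out, "li")
            li.text = "".join(src.itertext())
    else:                                  # xsl:otherwise
        out = ET.Element("div")
        out.text = ol.text
        for child in ol:                   # pass the sub-elements through
            out.append(copy.deepcopy(child))
    return out

with_items = convert_ol(ET.fromstring("<OL><LI>x</LI><LI>y</LI></OL>"))
without_items = convert_ol(ET.fromstring("<OL>no items</OL>"))
print(ET.tostring(with_items, encoding="unicode"))
print(ET.tostring(without_items, encoding="unicode"))
```

An OL element without LI children never becomes an empty ol, so the rule
"at least one li below ol" cannot be violated by the output.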

The document conversion method of this embodiment
described above allows modification as shown in Fig. 7. Fig.
7 shows an example of the conversion process in a case where a
structured document not following XML, for example a
Compact HTML document for i-mode (an information service for
cellular phones via the Internet), is used as the structured
document before conversion (first structured document). In
this modification, shaping process S201 using a shaping tool
is added to the above-described embodiment.

In this example of modification, a document needs to follow
the document type definition (DTD) of XML in order to activate
the XSLT engine as a document structure conversion tool. The
XML document needs to have a declaration statement such as an XML
declaration, and all the elements need to be described exactly
in the nesting structure. Shaping process S201 is performed
in order to shape a structured document F1 which is not based
on XML to follow the specification of XML (well-formed).
In the shaping process S201, the following process is performed.

The process includes correcting the nesting of the
start tag and the end tag, adding the end tag if the end tag
is not attached, and so on. Further, the process includes
inserting "/" if an empty element exists (e.g., <BR/>),
enclosing an attribute value in double quotation marks, adding an
attribute value if the attribute value has been omitted,
converting the element name and attribute name to lowercase letters,
and so on.
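
The corrections listed above can be sketched with the Python standard library. This is a minimal illustration, not the shaping tool of the embodiment: the class name Shaper and the set EMPTY_ELEMENTS are assumptions, and an omitted attribute value is filled with the attribute name, which is one common convention rather than something the text specifies.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

EMPTY_ELEMENTS = {"br", "hr", "img", "input", "meta"}  # assumed subset

class Shaper(HTMLParser):
    """Rewrite loosely written HTML as well-formed XML text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases element and attribute names.
        parts = [tag]
        for name, value in attrs:
            # An omitted value (e.g. "checked") is filled with the name.
            parts.append(name + "=" + quoteattr(name if value is None else value))
        if tag in EMPTY_ELEMENTS:
            self.out.append("<" + " ".join(parts) + "/>")  # insert "/"
        else:
            self.out.append("<" + " ".join(parts) + ">")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: drop it
        # Close elements left open inside this one to repair the nesting.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</" + top + ">")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def shape(self, text):
        self.feed(text)
        self.close()
        while self.open_tags:  # add end tags that were never written
            self.out.append("</" + self.open_tags.pop() + ">")
        return "".join(self.out)

print(Shaper().shape("<P ALIGN=center>Hello<BR><B>world</P>"))
```

A dedicated tool such as HTML Tidy handles many more cases (for example, implied end tags of list items); the sketch only covers the corrections named in the paragraph.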

As shown in Fig. 7, shaping process S201 is performed in
order to shape the structured document F1 before conversion to
follow the specification of XML. In the shaping process S201,
free software (e.g., HTML Tidy) can be used. Document structure
conversion S101 is performed on the document shaped by the shaping
process S201 in order to generate a new structured document F3.
The conversion template T2 describes an appropriate conversion
rule by interpreting the document type definition D1 before
conversion and the document type definition D2 in order to output
a result according to the document type definition D2 after
conversion. The process is complete once the document structure
conversion S101 is performed for conversion of the "shaped"
structured document F1 to a new structured document F3.

Document Conversion Program and Document Conversion
System

The above-mentioned document conversion method can be
achieved by a personal computer or workstation in which a program
described in an appropriate computer language is installed. In
a case where such a document conversion program is installed
in a computer, that computer functions as a document conversion
system.

Fig. 8 is a block diagram showing the configuration of
a computer 1 in which the document conversion program is installed.
As shown in the figure, the computer 1 comprises a hard disk
11, a printer interface 12, a display interface 13, an I/O device
14, a memory 15, a communication device 16, a CPU 17 and a bus
18 for connecting these devices, etc.

The hard disk 11 is a recording medium which stores various
kinds of data. Various kinds of data read via the I/O device
14 are stored in the hard disk 11 and the data is outputted to
the memory 15 or the CPU 17 according to a request by the CPU
17. Further, data, which is the result of processes in each
device, is also stored in the hard disk 11. This hard disk 11
stores the document conversion program P1, and the document
conversion program P1 is activated and controlled by the
CPU 17.

The printer interface 12 is a device for connecting the
computer 1 to an external printer, etc. and performs file printing
depending on a request from the CPU 17, etc. The display interface
13 displays images based on display data generated by the CPU
17 and displays appropriate images for control of the document
conversion program P1 or a result of various processes.

The communication device 16 is a communication unit such
as a LAN card or a modem, which connects the computer 1 to a
communication network 20 such as the Internet, etc. via a
communication line so as to transmit/receive data. The computer
1 is capable of receiving data from an external terminal or
transmitting a converted document file through the communication
device 16.

The I/O device 14 is a device for reading/writing data
from/to an external recording medium, such as a flexible disk
drive and a CD-ROM drive. According to this embodiment, the
conversion template T2, the document type definitions D1, D2
and the structured documents F1/F3 are inputted/outputted.

The memory 15 is a main memory device for storing data
temporarily when the CPU 17 executes a process. The memory 15
holds data read out from the hard disk 11 or a result of processes
executed by the CPU 17.

The CPU 17 is a central processing unit, which functions
as a document type definition analyzer 17a, a conversion template
generator 17b, a document structure converter 17c, a shaper 17d,
a file I/O unit 17e, a communication processor 17f, a display
data generator 17g and a printing processor 17h, by executing
the document conversion program P1 read out from the hard disk
11.

The document type definition analyzer 17a analyzes the
document type definition D1 and the document type definition
D2 after conversion, and extracts a difference between these
document type definitions. According to this embodiment, this
document type definition analyzer 17a comprises an identifier
correspondence table storing unit for storing the identifier
correspondence table in which the identifier of the document type
definition before conversion and the identifier of the document
type definition after conversion are linked, a logical structure
extracting unit for extracting a first logical structure defined
by the identifier of the document type definition D1 as well
as a second logical structure defined by the identifier of the
document type definition D2, and a condition detector which
compares the first logical structure with the second logical
structure according to the identifier correspondence table and
analyzes the condition based on differing portions between
both structures.

The identifier correspondence table storing unit can be
achieved with a cache memory inside the CPU 17, and the hard disk
11 or the memory 15 can also be used as an auxiliary means.

The logical structure extracting unit reads data contained
in the document type definitions D1 and D2 sequentially and
verifies the data using identifiers described in the identifier
correspondence table. In a case where a matching identifier
is detected, the logical structure extracting unit extracts its
pattern by referring to a logical structure existing below the
identifier.
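
The extraction step can be pictured with a short Python sketch. It assumes that the document type definitions are DTD text with simple <!ELEMENT ...> declarations and that the "pattern" is the declared content model; the function name, the sample DTD fragments and the table contents are hypothetical.

```python
import re

# Matches simple declarations such as <!ELEMENT OL (LI)*>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)>", re.S)

def extract_logical_structures(dtd_text, identifiers):
    """Return {identifier: content model} for the listed identifiers."""
    structures = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        if name in identifiers:  # identifier found in the correspondence table
            structures[name] = " ".join(model.split())
    return structures

# Hypothetical DTD fragments and identifier correspondence table.
D1 = "<!ELEMENT OL (LI)*>\n<!ELEMENT LI (#PCDATA)>"
D2 = "<!ELEMENT ol (li)+>\n<!ELEMENT li (#PCDATA)>"
CORRESPONDENCE = {"OL": "ol", "LI": "li"}

first = extract_logical_structures(D1, set(CORRESPONDENCE.keys()))
second = extract_logical_structures(D2, set(CORRESPONDENCE.values()))
print(first)
print(second)
```

A regular expression is enough for this illustration; real DTDs with parameter entities or conditional sections would need a proper DTD parser.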

The condition detector compares rules specified for the
document type definitions D1 and D2 before/after conversion so
as to detect a condition which generates a difference. For
example, the condition detector detects a condition where a
difference in pattern occurs depending on how many LI elements
exist below the UL.
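
As a rough illustration of this comparison, the following Python sketch renames the identifiers of each source content model through the correspondence table and reports the entries whose model then differs from the target. The content models, the table and both function names are assumed sample data, not values taken from the embodiment.

```python
import re

def rename_model(model, correspondence):
    """Rewrite element names in a content model using the table."""
    return re.sub(r"[A-Za-z_][\w.-]*",
                  lambda m: correspondence.get(m.group(0), m.group(0)),
                  model)

def detect_conditions(first, second, correspondence):
    """Return (identifier, source model, target model) for each difference."""
    conditions = []
    for src, dst in correspondence.items():
        if src in first and dst in second:
            expected = rename_model(first[src], correspondence)
            if expected != second[dst]:
                conditions.append((src, first[src], second[dst]))
    return conditions

first = {"OL": "(LI)*", "LI": "(#PCDATA)"}    # assumed structure from D1
second = {"ol": "(li)+", "li": "(#PCDATA)"}   # assumed structure from D2
table = {"OL": "ol", "LI": "li"}
print(detect_conditions(first, second, table))
# prints [('OL', '(LI)*', '(li)+')]
```

With this sample data the only reported difference is that the source allows zero LI elements while the target requires at least one, which is the kind of condition a conversion rule has to handle.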

The conversion template generator 17b generates a
conversion template T2 according to a result of the document
type definition analyzer 17a. The conversion template T2
describes a conversion rule which prevents the structured
document F3, which is a result of the document conversion, from
contradicting the document type definition D2. According to this
embodiment, the conversion template generator 17b generates a
conversion rule based on the aforementioned condition about the
differing portions and its corresponding logical structure after
conversion (pattern extracted from D2). The conversion
template generator 17b then correlates the identifier
correspondence table with the conversion rule and converts them
to the format of the conversion template.
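
One way to picture this generation step is a function that writes out XSLT text for a detected condition. The Python sketch below is an assumption-laden illustration: the function name, its parameters and the choice of div as the fallback element are hypothetical, and the test expression is written as a numeric comparison.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

def generate_template(src, dst, child_src, fallback):
    """Build XSLT text that emits dst only when child_src elements exist."""
    return (
        f'<xsl:template match="{src}">\n'
        f'  <xsl:choose>\n'
        f'    <xsl:when test="count({child_src}) != 0">\n'
        f'      <{dst}><xsl:apply-templates select="{child_src}"/></{dst}>\n'
        f'    </xsl:when>\n'
        f'    <xsl:otherwise>\n'
        f'      <{fallback}><xsl:apply-templates/></{fallback}>\n'
        f'    </xsl:otherwise>\n'
        f'  </xsl:choose>\n'
        f'</xsl:template>'
    )

rule = generate_template("OL", "ol", "LI", "div")
stylesheet = (f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">'
              f'{rule}</xsl:stylesheet>')
ET.fromstring(stylesheet)  # raises ParseError if the text is not well-formed
print(rule)
```

The well-formedness check only confirms that the generated text parses as XML; it does not run the stylesheet.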

The document structure converter 17c processes the
document conversion using the conversion template. The
document structure converter replaces the identifiers described
in the identifier correspondence table and converts the argument
attached to the identifier. Further, the document structure
converter 17c adds, deletes and converts the logical structure
of an identifier which matches the aforementioned condition
according to the template for replacing.
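
The identifier replacement part of this step can be sketched as a recursive copy of the document tree. The Python fragment below assumes that the "argument attached to the identifier" corresponds to an attribute and uses two hypothetical tables, one for element names and one for attribute names.

```python
import xml.etree.ElementTree as ET

def replace_identifiers(elem, tags, attrs):
    """Copy elem, renaming element and attribute names via the tables."""
    out = ET.Element(tags.get(elem.tag, elem.tag))
    for name, value in elem.attrib.items():
        out.set(attrs.get(name, name), value)
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(replace_identifiers(child, tags, attrs))
    return out

TAGS = {"UL": "ul", "LI": "li"}   # hypothetical element name table
ATTRS = {"TYPE": "type"}          # hypothetical attribute name table

source = ET.fromstring('<UL TYPE="disc"><LI>a</LI><LI>b</LI></UL>')
converted = replace_identifiers(source, TAGS, ATTRS)
print(ET.tostring(converted, encoding="unicode"))
```

Names that do not appear in a table are copied unchanged; structural additions and deletions are left to the condition-based rules.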

The shaper 17d shapes the first structured document F1
so as to enable conversion by the document structure converter
17c and corrects erroneous description in the structured document
F1 (this is not required for a shaped document, e.g., XML). More
specifically, the shaper 17d corrects the nesting of the start tag
and the end tag, and adds the end tag if the end tag is not already
attached. Further, the shaper 17d inserts "/" if an empty element
exists (e.g., <BR/>), encloses an attribute value in double
quotation marks, adds an attribute value if the attribute value has
been omitted, converts the element name and attribute name to
lowercase letters, and so on.

The file I/O unit 17e controls input/output of a file and
the operation of the hard disk 11 as well as the I/O device 14. More
specifically, the file I/O unit 17e reads the structured document
F1, the conversion template T2, the identifier correspondence
table, etc. The file I/O unit 17e also stores the structured
document F3 in the hard disk 11 and writes it into a flexible
disk or a CD-R, etc. through the I/O device 14. Further, the
file I/O unit 17e inputs or outputs each file to/from the
communication processor 17f or printing processor 17h as
required.

The communication processor 17f controls the
communication device 16 and is connected to the network 20 through
the communication device 16 so as to transmit/receive the
structured document F1 and the structured document F3 to/from
an external terminal. The communication processor 17f also
receives a conversion request of a file from the other terminals
through the communication device 16.

The display data generator 17g generates image data for
displaying on a screen and controls the display interface 13.
Image data is displayed on an external display unit through the
display interface 13. This display data includes graphic data
to be generated according to the document conversion program
P1, and the display data is used to display an image for control
of each process and a review of each file.

The printing processor 17h controls the printer interface
12 to print the structured document F3 by an external printer.

Operation

The document conversion system can be achieved by executing
the document conversion program described above on a personal
computer, etc. The operation of this document conversion system
will be described with reference to Fig. 9. Fig. 9 is a flowchart
showing the process of the document conversion system.

As shown in Fig. 9, the document type definition D1 before
conversion is read out and analyzed (S201). More specifically,
a file is read out from the I/O device 14 or the hard disk 11
and analyzed by the document type definition analyzer 17a.
Similarly, the document type definition D2 after conversion is
read out and analyzed (S202). After that, the conversion
template is generated (S203). More specifically, the document
type definition analyzer 17a analyzes the document type
definitions D1/D2 and extracts a difference between these document
type definitions.

Next, the structured document F1 is read out (S204), the
read-out structured document F1 is shaped (S205) if shaping is
required, and the document structure of the shaped document is
converted (S206).

Then, the converted structured document F3 is outputted
(S207). This output includes writing it into the I/O device
14 or the hard disk 11, transmitting it to the network 20 through
the communication device 16 and printing it out through the
printer interface 12.
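
The order of steps S201 to S207 can be summarized as a small driver function. The Python sketch below is a hedged outline only: run_conversion and the keys of the steps mapping are hypothetical names, and the stand-in callables in the example do no real analysis or conversion.

```python
def run_conversion(steps, d1, d2, f1, needs_shaping):
    """Run the flow of Fig. 9; `steps` maps step names to callables."""
    analysis1 = steps["analyze"](d1)                    # S201
    analysis2 = steps["analyze"](d2)                    # S202
    template = steps["generate"](analysis1, analysis2)  # S203
    document = f1                                       # S204 (already read)
    if needs_shaping:
        document = steps["shape"](document)             # S205
    converted = steps["convert"](document, template)    # S206
    steps["output"](converted)                          # S207
    return converted

# Trivial stand-ins, used only to show the call order.
example_steps = {
    "analyze": lambda dtd: dtd.split(),
    "generate": lambda a, b: dict(zip(a, b)),
    "shape": lambda doc: doc.strip(),
    "convert": lambda doc, template: template.get(doc, doc),
    "output": print,
}
run_conversion(example_steps, "OL LI", "ol li", " OL ", needs_shaping=True)
```

Passing the steps in as callables keeps the flow itself independent of how analysis, shaping and conversion are implemented.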

Computer Readable Recording Medium Storing Document
Conversion Program

The above-described document conversion program can be
stored in a recording medium readable by the computer 1. This
computer readable recording medium includes, as shown in Fig.
10, a flexible disk 216, a CD-ROM 217, a ROM 218, a magnetic
tape 219, etc.

As shown in Fig. 11, the computer readable recording medium
storing such a document conversion program enables document
conversion by using a computer 30 such as a notebook type personal
computer, a desktop personal computer or a workstation.

For example, in a case where the structured document F1
which is to be converted is stored in a file as shown in Fig.
11, such a structured document stored in a local disk is converted
by the computer 30 in which the above-described document
conversion program is installed, as a document converter.

Although the above embodiment has been described for
a case where both the hard disk 11 for storing the structured
documents F1, F3 and the CPU 17 for arithmetic operation, etc.
are incorporated in a single computer, the present invention
is not restricted to this example. For example, the
above-described respective devices can be decentralized on
plural computers.

Fig. 12 is a schematic diagram showing a case where the
above-described respective devices are decentralized on plural
computers. As shown in the figure, the structured document F1
which is to be converted is stored in a content server 401 which
is connected to the World Wide Web (WWW). The structured document
F1 can be converted by a conversion server 402 depending on a
conversion request issued by a client terminal 403.

In this case, the conversion server 402 in which the
above-described document conversion program is installed is
utilized. The conversion server 402 is connected to the
communication network (e.g., the Internet). The conversion
server 402 comprises a receiving unit for receiving a conversion
request from the client terminal 403 via the communication
network and obtaining the structured document F1 from the content
server 401. The conversion server 402 also comprises a
transmitting unit for transmitting the structured document F3
after conversion to the client terminal device 403 via the
communication network. The above-described communication
device 16 can be used to function as the transmitting unit and
the receiving unit.

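
The division of roles among the client terminal, the conversion server and the content server can be outlined as a single request handler. In the Python sketch below, the function name, the request fields and the dictionary standing in for the content server 401 are all hypothetical; network transport is deliberately left out.

```python
def handle_conversion_request(request, fetch, convert):
    """Receive a request, obtain F1, convert it, and return F3."""
    source = fetch(request["url"])   # receiving unit: obtain F1
    converted = convert(source)      # document structure conversion
    # transmitting unit: the result is addressed to the requesting client
    return {"client": request["client"], "document": converted}

# Stand-ins: a dictionary plays the content server and str.lower plays
# the conversion, only so that the example runs on its own.
content_server = {"http://example.com/a": "<UL><LI>x</LI></UL>"}
response = handle_conversion_request(
    {"client": "terminal-403", "url": "http://example.com/a"},
    fetch=content_server.__getitem__,
    convert=str.lower,
)
print(response["document"])
```

Injecting fetch and convert as parameters means the same handler could sit behind any transport that the communication device provides.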
As explained above, according to the present invention,
since the validity verification step for the document type
definition after conversion is omitted by replacing it with an
appropriate conversion template in conversion of the structured
document, the total time for the document structure conversion
can be reduced.

The present invention has been described in detail by
referring to the embodiments. It is obvious to those skilled
in the art that the present invention is not restricted to the
embodiments mentioned above. The present invention may be
carried out as a corrected or modified embodiment not departing
from the gist and scope specified by the scope of claim for a
patent. Therefore, the description of this specification aims
at the representation of examples but does not have any limitation
on the present invention.

What is claimed is:

1. A document conversion system for converting a first
structured document formed based on a first document schema into
a second structured document formed based on a second document
schema, the document conversion system comprising:

a document type definition analyzer for analyzing the first
document schema and the second document schema and extracting
a different document type definition;

a conversion template generator for generating a
conversion template having described therein a conversion rule
which prevents the second structured document, which is the
result of a document conversion process, from being contradictory
to the second document schema, based on the results of the
analysis performed by the document type definition analyzer;
and

a document structure converter for performing document
conversion process using the conversion template.

2. The document conversion system according to claim 1,
wherein the first document schema and the second document schema
each have an identifier for defining the logical structure of
a character string constituting a document,

the document type definition analyzer comprises:

an identifier correspondence table storing unit for
storing an identifier correspondence table which makes a
correspondence between the identifier of the first document
schema and the identifier of the second document schema;

a logical structure extracting unit for extracting a first
logical structure defined by the identifier of the first document
schema and a second logical structure defined by the identifier
of the second document schema; and

a condition detector for detecting portions that differ
between the first logical structure and the second logical
structure by comparing both structures according to the
identifier correspondence table and analyzing conditions
generated by the detected differing portions, and

the conversion template generator generates a
conversion rule based on the condition of the detected differing
portions and their corresponding second logical structure.

3. The document conversion system according to claim 1 further
comprising a file recorder for storing the first structured
document and the second structured document as file data, wherein

the document structure converter converts the first
structured document read out from the file recorder.

4. The document conversion system according to claim 1 further
comprising:

a receiver which is connected to a communication network
for acquiring a conversion request and the first structured
document from the communication network; and

a transmitter for transmitting the second structured
document converted by the document structure converter to the
communication network.

5. The document conversion system according to claim 1 further
comprising a shaper for correcting errors in the description
of the first structured document so that the first structured
document can be read by the document structure converter.

6. A document conversion method for converting a first
structured document formed based on a first document schema into
a second structured document formed based on a second document
schema, the document conversion method comprising the steps of:

(A) analyzing the first document schema and the second
document schema and extracting a different document type
definition;

(B) generating a conversion template having described
therein a conversion rule which prevents the second structured
document, which is the result of a document conversion process,
from being contradictory to the second document schema, based
on the results of the analysis; and

(C) performing document conversion process using the
conversion template.

7. The document conversion method according to claim 6,
wherein the first document schema and the second document schema
each have an identifier for defining the logical structure of
a character string constituting a document,

the step (A) comprises the steps of:

(A-1) extracting a first logical structure defined by the
identifier of the first document schema and a second logical
structure defined by the identifier of the second document
schema;

(A-2) detecting portions that differ between the first
logical structure and the second logical structure by comparing
both structures according to an identifier correspondence table
which makes a correspondence between the identifier of the first
document schema and the identifier of the second document schema;
and

(A-3) analyzing conditions which are generated by the
detected differing portions, and

the step (B) is for generating a conversion rule based
on the condition of the detected differing portions and their
corresponding second logical structure.

8. The document conversion method according to claim 6,
wherein the first structured document and the second structured
document are stored in a file recorder as file data, and

the step (C) is for converting the first structured
document read from the file recorder.

9. The document conversion method according to claim 6 further
comprising:

a step of acquiring a conversion request and the first
structured document from a communication network, and

a step of transmitting a converted second structured
document to the communication network in the step (C).

10. The document conversion method according to claim 6, 
wherein the step (C) includes a step of correcting errors in 
the description of the first structured document so that the 
first structured document can be read. 

11. A computer readable recording medium storing a document
conversion program which converts a first structured document
formed based on a first document schema into a second structured
document formed based on a second document schema and makes a
computer execute a process comprising the steps of:

(A) analyzing the first document schema and the second
document schema and extracting a different document type
definition;

(B) generating a conversion template having described
therein a conversion rule which prevents the second structured
document, which is the result of a document conversion process,
from being contradictory to the second document schema, based
on the results of the analysis; and

(C) performing document conversion process using the
conversion template.

12. The computer readable recording medium storing the
document conversion program according to claim 11, wherein the
first document schema and the second document schema each have
an identifier for defining the logical structure of a character
string constituting a document,

the step (A) comprises the steps of:

(A-1) extracting a first logical structure defined by the
identifier of the first document schema and a second logical
structure defined by the identifier of the second document
schema;

(A-2) detecting portions that differ between the first
logical structure and the second logical structure by comparing
both structures according to an identifier correspondence table
which makes a correspondence between the identifier of the first
document schema and the identifier of the second document schema;
and

(A-3) analyzing conditions which are generated by the
detected differing portions, and

the step (B) is for generating the conversion rule based
on the condition of the detected differing portions and their
corresponding second logical structure.

13. The computer readable recording medium storing the
document conversion program according to claim 11, wherein the
first structured document and the second structured document
are stored in a file recorder as file data, and

the step (C) is for converting the first structured
document read from the file recorder.

14. The computer readable recording medium storing the
document conversion program according to claim 11 further
comprising:

a step of acquiring a conversion request and the first
structured document from a communication network, and

a step of transmitting a converted second structured
document to the communication network in the step (C).

15. The computer readable recording medium storing the
document conversion program according to claim 11, wherein the
step (C) includes a step of correcting errors in the description
of the first structured document so that the first structured
document can be read.

ABSTRACT OF THE INVENTION

This invention aims at reducing the total time required
for document conversion by outputting appropriate document
data which matches a document type definition after conversion
so as to omit a validity verification step in the document
structure conversion.

Specifically, this invention provides a document
conversion method for converting a first structured document
F1, formed based on a first document type definition D1, to a
second structured document F3, formed based on a second document
type definition D2. The document conversion method comprises
analyzing the document type definition D1 and the document type
definition D2 and extracting a different document type definition,
generating a conversion template T2 having described therein a
conversion rule which prevents the structured document F3, which
is the result of the document conversion process, from being
contradictory to the document type definition D2, based on the
results of the analysis, and performing the document conversion
process using the conversion template T2.

<xsl:template match="UL">
<ul>
<xsl:apply-templates select="LI"/>
</ul>
</xsl:template>

<xsl:template match="LI">
<li><xsl:value-of select="."/></li>
</xsl:template>

FIG. 4A

(5)

<xsl:template match="UL">
<xsl:apply-templates select="LI"/>
</xsl:template>

<xsl:template match="LI">
<ul>
<li><xsl:value-of select="."/></li>
</ul>
</xsl:template>

<xsl:template match="UL">
<xsl:for-each select="LI">
<ul>
<li><xsl:value-of select="."/></li>
</ul>
</xsl:for-each>
</xsl:template>